Knowledge Author: Facilitating user-driven, domain content development to support clinical information extraction
Background: Clinical Natural Language Processing (NLP) systems require a semantic schema composed of domain-specific concepts, their lexical variants, and associated modifiers to accurately extract information from clinical texts. An NLP system leverages this schema to structure concepts and extract meaning from free text. In the clinical domain, creating a semantic schema typically requires input from both a domain expert, such as a clinician, and an NLP expert who represents the clinical concepts drawn from the clinician's domain expertise in a computable format usable by an NLP system. The goal of this work is to develop a web-based tool, Knowledge Author, that bridges the gap between the clinical domain expert and NLP system development by facilitating the development of domain content represented in a semantic schema for extracting information from clinical free text. Results: Knowledge Author is a web-based recommendation system that supports users in developing the domain content necessary for clinical NLP applications. Knowledge Author's schematic model leverages a set of semantic types derived from the Secondary Use Clinical Element Models and the Common Type System to allow the user to quickly create and modify domain-related concepts. Features such as collaborative development and domain content suggestions, provided by mapping concepts to the Unified Medical Language System Metathesaurus database, further support the domain content creation process. Two proof-of-concept studies were performed to evaluate the system's performance. The first study evaluated Knowledge Author's flexibility to create a broad range of concepts. A dataset of 115 concepts was compiled, of which 87 (76%) could be created using Knowledge Author.
The second study evaluated the effectiveness of Knowledge Author's output in an NLP system by extracting concepts and associated modifiers representing a clinical element, carotid stenosis, from 34 clinical free-text radiology reports using Knowledge Author and an NLP system, pyConText. Knowledge Author's domain content produced high recall for concepts (targeted findings: 86%) and varied recall for modifiers (certainty: 91%, sidedness: 80%, neurovascular anatomy: 46%). Conclusion: Knowledge Author can support clinical domain content development for information extraction by enabling semantic schema creation by domain experts.
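The recall figures above follow the usual definition, recall = true positives / (true positives + false negatives). The following is a minimal Python sketch of that calculation. The function names and the counts in the second example are hypothetical, since the abstract reports only percentages apart from the 87 of 115 concepts in the first study.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of reference items that the system found."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("recall is undefined when there are no reference items")
    return true_positives / total


def as_percent(fraction: float) -> int:
    """Round a fraction to a whole-number percentage."""
    return round(fraction * 100)


# Reported in the abstract: 87 of 115 concepts could be created.
coverage = as_percent(87 / 115)

# Hypothetical counts, chosen only to illustrate the calculation;
# they are not taken from the study.
example_recall = as_percent(recall(true_positives=43, false_negatives=7))

print(f"concept coverage: {coverage}%")
print(f"example recall: {example_recall}%")
```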
Surface salinity measurements - COSMOS 2005 experiment in the Bay of Biscay
12 pages, 7 figures, 2 tables. Sea surface salinity (SSS) data were collected in the Bay of Biscay between April and November 2005. The major source of data is 15 surface drifters deployed during the COSMOS experiment in early April and early May 2005 [12 from the Scripps Institution of Oceanography (SIO) and 3 from METOCEAN]. This is complemented by thermosalinograph (TSG) data from four French research vessels and four merchant vessels, by salinity profiles collected by Argo profiling floats and CTD casts, and by surface samples taken during two cruises. Time during the two cruises was dedicated to direct inspection of the drifters, recovering some, and providing validation data. This dataset provides a unique opportunity to estimate the accuracy of the SSS data and to evaluate the long-term performance of the drifter salinities. Some of the TSG SSS data were noisy, presumably from bubbles. The TSG data from the research vessels needed to be corrected for biases, which are very commonly larger than 0.1 pss-78 (practical salinity scale), and which in some instances evolved quickly from day to day. These corrections are only available when samples were collected or ancillary data are available (e.g., from CTD profiles). The resulting accuracy of the corrected TSG dataset, which varies strongly in time, is discussed. The surface drifter SSS data presented anomalous daytime values during days with strong surface warming. These data had to be excluded from the dataset. The drifter SSS presented initial biases in the range 0.009 to −0.026 pss-78. The (usually) negative bias increased by an average of −0.007 pss-78 during the average 65-day period before the COSMOS-2 cruise on 22–27 June.
High chlorophyll derived from satellite ocean color, and therefore a high density of phytoplanktonic cells, is observed in Medium Resolution Imaging Spectrometer (MERIS)/Moderate Resolution Imaging Spectroradiometer (MODIS) composites during part of the period, in particular in late April or early May. No correlation was found between the change in bias and the estimated surface chlorophyll. Evolution during the following summer months is harder to ascertain. For three buoys, there is little change in bias, but for two others, there could have been an increase in bias by up to 0.03 or 0.04 pss-78 during July–August. Seven drifters were recovered in the autumn, which provide recovery or postrecovery estimates of the biases, suggesting in three cases (out of seven) a large (0.02–0.03 pss-78) increase in bias during the autumn months, but no significant increase for the other four drifters. Funding for the research was provided by CNES. Peer reviewed
Developing a web-based SKOS editor
BACKGROUND: The Simple Knowledge Organization System (SKOS) was introduced to the wider research community by a 2005 World Wide Web Consortium (W3C) working draft, and further developed and refined in a 2009 W3C recommendation. Since then, SKOS has become the de facto standard for representing and sharing thesauri, lexicons, vocabularies, taxonomies, and classification schemes. In this paper, we describe the development of a web-based, free, open-source SKOS editor built for the development, curation, and management of small to medium-sized lexicons for health-related Natural Language Processing (NLP). RESULTS: The web-based SKOS editor allows users to create, curate, version, manage, and visualise SKOS resources. We tested the system against five widely-used, publicly-available SKOS vocabularies of various sizes and found that the editor is suitable for the development and management of small to medium-sized lexicons. Qualitative testing has focussed on using the editor to develop lexical resources to drive NLP applications in two domains: first, developing a lexicon to support an Electronic Health Record-based NLP system for the automatic identification of pneumonia symptoms; second, creating a taxonomy of lexical cues associated with Diagnostic and Statistical Manual of Mental Disorders (DSM-5) diagnoses, with the goal of facilitating the automatic identification of symptoms associated with depression from short, informal texts. CONCLUSIONS: The SKOS editor we have developed is, to the best of our knowledge, the first free, open-source, web-based SKOS editor capable of creating, curating, versioning, managing, and visualising SKOS lexicons.
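To illustrate the kind of lexicon entry such an editor manages, here is a minimal Python sketch that models one SKOS concept and writes it out as Turtle. The `skos:Concept`, `skos:prefLabel`, `skos:altLabel`, and `skos:broader` terms and the SKOS namespace come from the W3C recommendation. The `Concept` class, the `ex:` namespace, and the example labels are hypothetical and are not taken from the editor described above. The sketch does not escape quotation marks inside labels.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
EX_PREFIX = "@prefix ex: <http://example.org/lexicon/> ."  # hypothetical namespace


@dataclass
class Concept:
    """One lexicon entry: a preferred label, lexical variants, and a parent."""
    local_name: str
    pref_label: str
    alt_labels: List[str] = field(default_factory=list)
    broader: Optional[str] = None  # local name of the parent concept, if any

    def to_turtle(self) -> str:
        """Serialise the concept as a single Turtle statement."""
        parts = [f'skos:prefLabel "{self.pref_label}"@en']
        parts += [f'skos:altLabel "{label}"@en' for label in self.alt_labels]
        if self.broader:
            parts.append(f"skos:broader ex:{self.broader}")
        body = " ;\n    ".join(parts)
        return f"ex:{self.local_name} a skos:Concept ;\n    {body} ."


# Hypothetical entry for a pneumonia-symptom lexicon.
cough = Concept("cough", "cough", alt_labels=["coughing"], broader="pneumonia_symptom")
document = "\n".join([SKOS_PREFIX, EX_PREFIX, "", cough.to_turtle()])
print(document)
```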
Surface salinity measurements - COSMOS 2005 experiment in the Bay of Biscay
International audience
Examining the Effects of Political Information and Intervention Stages on Public Support for Military Interventions: A Panel Experiment
Examining the effects of political information and intervention stages on public support for military interventions: A panel experiment
A Framework for Leveraging "Big Data" to Advance Epidemiology and Improve Quality: Design of the VA Colonoscopy Collaborative.
Objective: To describe a framework for leveraging big data for research and quality improvement purposes and demonstrate implementation of the framework for design of the Department of Veterans Affairs (VA) Colonoscopy Collaborative. Methods: We propose that research utilizing large-scale electronic health records (EHRs) can be approached in a 4-step framework: 1) Identify data sources required to answer research question; 2) Determine whether variables are available as structured or free-text data; 3) Utilize a rigorous approach to refine variables and assess data quality; 4) Create the analytic dataset and perform analyses. We describe implementation of the framework as part of the VA Colonoscopy Collaborative, which aims to leverage big data to 1) prospectively measure and report colonoscopy quality and 2) develop and validate a risk prediction model for colorectal cancer (CRC) and high-risk polyps. Results: Examples of implementation of the 4-step framework are provided. To date, we have identified 2,337,171 Veterans who have undergone colonoscopy between 1999 and 2014. Median age was 62 years, and 4.6 percent (n = 106,860) were female. We estimated that 2.6 percent (n = 60,517) had CRC diagnosed at baseline. An additional 1 percent (n = 24,483) had a new ICD-9 code-based diagnosis of CRC on follow-up. Conclusion: We hope our framework may contribute to the dialogue on best practices to ensure high-quality epidemiologic and quality improvement work. As a result of implementation of the framework, the VA Colonoscopy Collaborative holds great promise for 1) quantifying and providing novel understandings of colonoscopy outcomes, and 2) building a robust approach for nationwide VA colonoscopy quality reporting.
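The percentages reported above can be reproduced from the stated counts. The short Python sketch below does that arithmetic. The counts are taken from the abstract, while the variable and function names are hypothetical.

```python
TOTAL_VETERANS = 2_337_171  # Veterans with a colonoscopy, 1999-2014 (from the abstract)

# Subgroup counts as reported in the abstract.
reported_counts = {
    "female": 106_860,
    "CRC at baseline": 60_517,
    "new CRC diagnosis on follow-up": 24_483,
}


def percent_of_cohort(count: int, total: int = TOTAL_VETERANS) -> float:
    """Return count as a percentage of the cohort, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {label: percent_of_cohort(n) for label, n in reported_counts.items()}

for label, value in percentages.items():
    print(f"{label}: {value} percent")
```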